Speaker identification using probabilistic PCA model selection
Abstract
Gaussian mixture model (GMM) techniques are popular for speaker identification. In theory, each Gaussian component should have a full covariance matrix. In practice, a diagonal covariance matrix is usually used, because its inverse is easy to compute within the expectation-maximization (EM) algorithm. This paper proposes a new probabilistic principal component analysis (PPCA) model for speaker identification that takes the full covariance of each speaker's data into account. The model originates from factor analysis theory, and the probability distributions under PPCA are well defined. In particular, GMM and PPCA are found to be equivalent when a diagonal covariance matrix is used. In this study, we derive a novel PPCA model selection method and build models for different speakers. With PPCA model selection, the number of speech features and the number of mixture components can be determined dynamically. Experiments show that PPCA achieves desirable speaker recognition performance with proper model regularization.
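To make the idea in the abstract concrete, the following is a minimal sketch of a single-component PPCA density, not the paper's own method. It assumes the standard closed-form maximum-likelihood fit from the PPCA literature, in which the model covariance is C = W Wᵀ + σ²I, and it uses BIC as a stand-in selection criterion because the abstract does not state which criterion the authors derive. The function names (`fit_ppca`, `avg_log_likelihood`, `select_q_by_bic`) and the choice of BIC are illustrative assumptions, and the mixture extension is omitted.

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form ML fit of one PPCA component with q latent dimensions (q < d)."""
    d = X.shape[1]
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)        # ML sample covariance, d x d
    eigvals, eigvecs = np.linalg.eigh(S)          # eigh returns ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    sigma2 = eigvals[q:].mean()                   # noise variance: mean of discarded eigenvalues
    scale = np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    W = eigvecs[:, :q] * scale                    # d x q factor loading matrix
    C = W @ W.T + sigma2 * np.eye(d)              # full model covariance
    return mu, C

def avg_log_likelihood(X, mu, C):
    """Average Gaussian log-likelihood of the rows of X under N(mu, C)."""
    d = X.shape[1]
    diff = X - mu
    _, logdet = np.linalg.slogdet(C)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(C), diff)
    return float(np.mean(-0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)))

def select_q_by_bic(X, candidates):
    """Pick the latent dimension with the lowest BIC (assumed criterion, not the paper's)."""
    n, d = X.shape
    best_q, best_bic = None, np.inf
    for q in candidates:
        mu, C = fit_ppca(X, q)
        # Free parameters: mean (d), W up to rotation (d*q - q*(q-1)/2), sigma2 (1)
        n_params = d + d * q - q * (q - 1) // 2 + 1
        bic = -2.0 * n * avg_log_likelihood(X, mu, C) + n_params * np.log(n)
        if bic < best_bic:
            best_q, best_bic = q, bic
    return best_q
```

Under these assumptions, identification would amount to fitting one model per enrolled speaker and assigning a test utterance to the speaker whose model gives the highest average log-likelihood over its feature vectors.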
Related resources
Dimension reduction for speaker identification based on mutual information
Dimension reduction is a necessary step for speech feature extraction in a speaker identification system. Discrete Cosine Transform (DCT) or Principal Component Analysis (PCA) is widely used for dimension reduction. By choosing, from the DCT or PCA basis vector pool, the basis vectors that contribute more to the data distribution variance or the reconstruction accuracy of the speech data set, we can transform th...
PCA Fuzzy Mixture Model for Speaker Identification
In this paper, we propose the principal component analysis (PCA) fuzzy mixture model for speaker identification. A PCA fuzzy mixture model is derived from the combination of PCA and the fuzzy version of the mixture model with diagonal covariance matrices. In this method, the feature vectors are first transformed by each speaker's PCA transformation matrix to reduce the correlation among the el...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Neural Network based Classification for Speaker Identification
Speaker Recognition is a challenging task and is widely used in many speech aided applications. This study proposes a new Neural Network (NN) model for identifying the speaker, based on the acoustic features of a given speech sample extracted by applying wavelet transform on raw signals. Wrapper based feature selection applies dimensionality reduction by kernel PCA and ranking by Info gain. Onl...
Speaker identification using relaxation labeling
A nonlinear probabilistic model of the relaxation labeling (RL) process is implemented in the speaker identification task in order to disambiguate the labeling of the speech feature vectors. Identification rates using the RL are higher than those using the conventional VQ (vector quantization) method.
Publication date: 2004